Data Engineering
Location: Temporarily Remote (Ontario, Canada); Halifax, Nova Scotia

Description

Position at Ness Digital Engineering
Required Skills & Experience
Technical Expertise
- Strong hands-on expertise in the Hadoop ecosystem (HDFS, Hive, Spark, Oozie, YARN, HBase, Kafka, ZooKeeper).
- Deep understanding of data ingestion, transformation, and storage patterns in large-scale environments. 
- Experience with distributed computing, data partitioning, and parallel processing. 
- Proficiency in SQL, PySpark, Scala, or Java (see the PySpark sketch after this list).
- Familiarity with cloud-native data lakes on AWS (EMR, Glue, S3), Azure (HDInsight, ADLS, Synapse), or GCP (Dataproc, BigQuery). 
- Knowledge of data governance tools (Apache Atlas, Ranger, Collibra) and workflow orchestration tools (Airflow, Oozie). 
- Expertise in data warehousing and ETL processes, including design, development, support, implementation, and testing.
- Hands-on experience in architecture and design, including requirement analysis, performance tuning, data conversion, loading, extraction, transformation, and creating job pipelines.
- Strong knowledge of the retail domain and experience with various stages of data warehouse projects, including data extraction, cleansing, aggregation, validation, transformation, and loading.
- Experience using DataStage components such as Sequential File, Join, Sort, Merge, Lookup, Transformer, Remove Duplicates, Copy, Filter, Funnel, Dataset, Change Data Capture, and Aggregator.
- Strong command of database operations (DDL and DML) and data warehousing implementation models.
- Hands-on experience with the Hadoop ecosystem, including HDFS, Hive, Sqoop, NiFi, and YARN.
- Familiarity with Mainframe ESP for job scheduling.
- Implementation experience with indexes, table partitioning, collections, analytical functions, and materialized views, including creating and managing tables, views, constraints, and indexes.
- Experience with CI/CD processes using Jenkins and SourceTree.
- Proficiency with ServiceNow, Confluence, Bitbucket, and JIRA.
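
For illustration only, here is a minimal PySpark sketch of the kind of ingest-transform-load work described in the list above, assuming a Hive metastore and an HDFS landing zone; all paths, database, table, and column names are hypothetical placeholders, not part of the role description.

```python
# Minimal PySpark sketch: ingest raw files, transform, and load a partitioned Hive table.
# Paths, database, table, and column names are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("edl-daily-sales-load")
    .enableHiveSupport()  # assumes a Hive metastore is configured
    .getOrCreate()
)

# Ingest: read raw CSV landed on HDFS (or S3/ADLS in a cloud-native lake).
raw = (
    spark.read
    .option("header", "true")
    .option("inferSchema", "true")
    .csv("hdfs:///data/raw/sales/")
)

# Transform: cleanse, deduplicate, and aggregate to a daily grain.
daily = (
    raw.dropDuplicates(["order_id"])
       .filter(F.col("amount") > 0)
       .withColumn("order_date", F.to_date("order_ts"))
       .groupBy("order_date", "store_id")
       .agg(F.sum("amount").alias("total_amount"),
            F.countDistinct("order_id").alias("order_count"))
)

# Load: write a partitioned Hive table for downstream SQL consumers
# (assumes the "curated" database already exists).
(
    daily.write
         .mode("overwrite")
         .partitionBy("order_date")
         .saveAsTable("curated.sales_daily")
)
```

Partitioning the curated table by date lets downstream Hive and Spark SQL queries prune to the relevant slices, which is the data-partitioning concern called out above.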
Preferred Qualifications
- Experience integrating an enterprise data lake (EDL) with modern lakehouse platforms (Databricks, Snowflake, Synapse, BigQuery); see the sketch after this list.
- Understanding of machine learning pipelines and real-time analytics use cases. 
- Exposure to data mesh or domain-driven data architectures. 
- Certifications in Hadoop, Cloudera, AWS, or Azure data services.
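
For illustration only, a minimal sketch of publishing curated EDL output in Delta format so a lakehouse platform such as Databricks can query it; it assumes the delta-spark package is on the classpath, and the bucket and table names are hypothetical.

```python
# Minimal sketch: publish a curated table as Delta so lakehouse engines can query it.
# Assumes the delta-spark package is available; bucket and table names are hypothetical.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("edl-to-lakehouse")
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .enableHiveSupport()
    .getOrCreate()
)

curated = spark.table("curated.sales_daily")  # produced by the earlier sketch

# Write as a Delta table; a lakehouse platform can then register and query it directly.
(
    curated.write
           .format("delta")
           .mode("overwrite")
           .save("s3a://company-lakehouse/curated/sales_daily")  # hypothetical bucket
)
```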